NAS A/TM-20 14-2 18285 



A Self-Stabilizing Hybrid-Fault Tolerant 
Synchronization Protocol 


Mahyar R. Malekpour 

Langley Research Center, Hampton, Virginia 


July 2014 




NASA STI Program ... in Profile 


Since its founding, NASA has been 
dedicated to the advancement of aeronautics 
and space science. The NASA scientific and 
technical infonnation (STI) program plays a 
key part in helping NASA maintain this 
important role. 

The NASA STI program operates under the 
auspices of the Agency Chief Information Officer. It 
collects, organizes, provides for archiving, and 
disseminates NASA’s STI. The NASA STI program 
provides access to the NASA Aeronautics and Space 
Database and its public interface, the NASA Technical 
Report Server, thus providing one of the largest 
collections of aeronautical and space science STI in 
the world. Results are published in both non-NASA 
channels and by NASA in the NASA STI Report 
Series, which includes the following report types: 

• TECHNICAL PUBLICATION. Reports of 
completed research or a major significant phase 
of research that present the results of NASA 
programs and include extensive data or 
theoretical analysis. Includes compilations of 
significant scientific and technical data and 
information deemed to be of continuing 
reference value. NASA counterpart of peer- 
reviewed formal professional papers, but having 
less stringent limitations on manuscript length 
and extent of graphic presentations. 

• TECHNICAL MEMORANDUM. Scientific 
and technical findings that are preliminary or of 
specialized interest, e.g., quick release reports, 
working papers, and bibliographies that contain 
minimal annotation. Does not contain extensive 
analysis. 

• CONTRACTOR REPORT. Scientific and 
technical findings by NASA-sponsored 
contractors and grantees. 


• CONFERENCE PUBLICATION. Collected 
papers from scientific and technical 
conferences, symposia, seminars, or other 
meetings sponsored or co-sponsored by NASA. 

• SPECIAL PUBLICATION. Scientific, 
technical, or historical information from NASA 
programs, projects, and missions, often 
concerned with subjects having substantial 
public interest. 

• TECHNICAL TRANSLATION. English- 
language translations of foreign scientific and 
technical material pertinent to NASA’s mission. 

Specialized services also include creating custom 

thesauri, building customized databases, and 

organizing and publishing research results. 

For more information about the NASA STI 

program, see the following: 

• Access the NASA STI program home page at 
http://www. sti. nasa. gov 

• E-mail your question via the Internet to 
help@sti.nasa.gov 

• Fax your question to the NASA STI Help Desk 
at 443-757-5803 

• Phone the NASA STI Help Desk at 
443-757-5802 

• Write to: 

NASA STI Help Desk 

NASA Center for AeroSpace Information 

7115 Standard Drive 

Hanover, MD 21076-1320 


NAS A/TM-20 14-2 18285 



A Self-Stabilizing Hybrid-Fault Tolerant 
Synchronization Protocol 


Mahyar R. Malekpour 

Langley Research Center, Hampton, Virginia 


National Aeronautics and 
Space Administration 

Langley Research Center 
Hampton, Virginia 23681-2199 


July 2014 




Available from: 


NASA Center for AeroSpace Information 
7115 Standard Drive 
Hanover, MD 21076-1320 
443-757-5802 





Abstract 


Abstract - In this report we present a strategy for solving the Byzantine general problem for self- 
stabilizing a fully connected network from an arbitrary state and in the presence of any number 
of faults with various severities including any number of arbitrary (Byzantine) faulty nodes. Our 
solution applies to realizable systems, while allowing for differences in the network elements, 
provided that the number of arbitrary faults is not more than a third of the network size. The 
only constraint on the behavior of a node is that the interactions with other nodes are restricted to 
defined links and interfaces. Our solution does not rely on assumptions about the initial state of 
the system and no central clock nor centrally generated signal, pulse, or message is used. Nodes 
are anonymous, i.e., they do not have unique identities. We also present a mechanical 
verification of a proposed protocol. A bounded model of the protocol is verified using the 
Symbolic Model Verifier (SMV). The model checking effort is focused on verifying correctness 
of the bounded model of the protocol as well as confirming claims of determinism and linear 
convergence with respect to the self-stabilization period. We believe that our proposed solution 
solves the general case of the clock synchronization problem. 
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1. Introduction 


Distributed systems have become an integral part of safety-critical computing applications, 
necessitating system designs that incorporate complex fault-tolerant resource management 
functions to provide globally coordinated operations with ultra-reliability. As a result, robust 
clock synchronization has become a required fundamental component of fault-tolerant safety- 
critical distributed systems. Since physical oscillators are inherently imperfect, local clocks of 
nodes of a distributed system, driven by these oscillators, do not keep perfect time and can drift 
with respect to real time and one another. Thus, the local clocks of the nodes must periodically 
be resynchronized. As a result, a fault-tolerant system needs a clock synchronization algorithm 
that tolerates imprecise local clocks and faulty behavior by some processes. In this paper we 
present a strategy for synchronizing distributed systems in the presence of various faults, 
including any number of arbitrary (Byzantine) faults. 

We define synchronization of a distributed system as the process of achieving and maintaining 
a bounded skew among independent local clocks by exchanging local time infonnation. A 
distributed system is defined to be self-stabilizing if, from an arbitrary initial state, it is 
guaranteed to reach a legitimate state in a finite amount of time and remain in a legitimate state. 
For clock synchronization, a legitimate state is a state where all parts in the system are in 
synchrony. 

The self-stabilizing distributed-system clock synchronization problem is to develop an algorithm 
(i.e., a protocol) to achieve and maintain synchrony of local clocks in a distributed system after it 
experiences system-wide disruptions in the presence of network element imperfections. 
Hereafter in this report, we use the term synchronization to mean self-stabilizing clock 
synchronization in distributed systems. 

Besides being an intellectual curiosity and a theoretical problem in computer science and 
engineering, synchronization has practical significance as a fundamental service for higher-level 
algorithms that solve other problems. For example, in safety-critical TDMA (Time Division 
Multiple Access) architectures [Kop 1997][Min 2002][Tor 2005A, 2005B], synchronization is 
the most crucial element of these systems. 

There is a vast literature on synchronization phenomena exhibited by humans, animals, and even 
inanimate objects. There are also many proposed solutions for synchronization of a large 
number of entities based on models inspired by nature or abstract ideas. There exist many 
solutions for special cases and restricted conditions. In the context of synchronization, the 
convergence and closure properties address achieving and maintaining network synchrony, 
respectively (see Section 3.3 for a formal definition of these parameters). There are many 
solutions that deal with the closure property [Lam 1985] [Sri 1987] [Wei 1988] which either do 
not address convergence or provide an ad hoc solution [Dav 1978] for initialization and 
integration, separately. Typically, the assumed topology is a regular graph 1 such as a fully 
connected graph or a ring. Although these topologies do not necessarily correspond to practical 
applications or biological, social, or technical networks, nevertheless, they provide a base case to 


1 A regular graph is a graph where each vertex has the same number of neighbors, i.e., every vertex has the same 
degree or valency. A regular graph with vertices of degree k is called a Lregular graph or regular graph of degree k. 
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solve the distributed synchronization problem. Furthennore, the existing models and solutions 
do not always achieve synchrony and, therefore, do not solve the general case of the distributed 
synchronization problem. Furthennore, even when the solutions achieve synchrony, the time to 
achieve synchrony is very large for many of the solutions. 

Another key factor in a proposed solution is whether or not it deals with faults. A fault is a 
defect or flaw in a system component resulting in an incorrect state [Tor 2005A][But 2008], The 
requirement to handle faults adds a new dimension to the complexity of the synchronization of 
fault-tolerant distributed systems. A fundamental property of a robust distributed system is the 
capability of tolerating and potentially recovering from failures that are not predictable in 
advance. In [Lam 1982, 1985] various ideas for overcoming failures in a robust distributed 
system are addressed that include tolerating Byzantine faults. There are many algorithms that 
address pennanent faults [Sri 1987], where the issue of transient failures is either ignored or 
inadequately addressed. There are many efficient Byzantine clock synchronization algorithms 
proposed that are based on assumptions on initial synchrony of the nodes [Sri 1987] [Wei 1988] 
or existence of a common pulse at the nodes, e.g., the first protocol in [Dol 2004]. There are also 
many clock synchronization algorithms that are based on randomization and, therefore, are non- 
detenninistic, e.g., the second protocol in [Dol 2004]. 

A thorough understanding of the synchronization of a distributed system has proven to be elusive 
for decades. The main challenge associated with distributed synchronization is the complexity of 
developing a correct and verifiable solution. It is possible to have a solution that is hard to 
prove or refute. Such a solution, however, is not likely to be accepted or used in practical 
systems. The proposed solutions must restore synchrony and coordinated operations after 
experiencing system-wide disruptions in the presence of network element imperfections and, for 
ultra-reliable distributed system, in the presence of various faults. In addition, a proposed 
solution must be proven to be correct. In the absence of a paper-and-pencil proof, the use of 
fully automated fonnal methods techniques is a viable alternative. In [Mai 2006A] a 
counterexample is presented to a clock synchronization algorithm [Dal 2003] that is based on the 
existence of a common pulse at the nodes. Furthennore, addressing network element 
imperfections, such as oscillator drift with respect to real time and differences in the lengths of 
the physical communication media, is necessary to make a solution applicable to realizable 
systems. 

Two Byzantine-fault-tolerant self-stabilizing protocols for distributed systems were reported in 
[Mai 2006B] and [Mai 2009]. Instances of these protocols were demonstrated via mechanical 
verification to self-stabilize from any state, in the presence of at most one pennanent Byzantine 
faulty node, and to deterministically converge in linear time with respect to the synchronization 
period [Mai 2008]. These protocols, however, do not solve the general case of the problem in 
the presence of multiple Byzantine faults [Mai 2009]. 

Drawing from our prior experience, in this report we present a strategy for solving the Byzantine 
general problem. Our solution self-stabilizes a fully connected network from an arbitrary initial 
state and in the presence of any number of arbitrary (Byzantine) faulty nodes, for realizable 
systems, while allowing for differences in the network elements, provided that the number of 
arbitrary faults is not more than a third of the network size [Lam 1982, 1985] [Dol 1984]. The 
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main problem in the self-stabilization problem is a lack of a symmetric view of the system across 
all good (non-faulty) nodes (processors). What if this issue is somehow resolved? Can the 
system self-stabilize in the presence of symmetric faults? A fault is symmetric when all good 
nodes observe consistent error manifestations, but do not know that it is bad [Min 2002], Thus, 
the crux of the solution presented in this report is to 1) first convert any message to a symmetric 
message and, 2) use a verified protocol that is based on a message symmetry assumption to solve 
the synchronization problem. 

There are a number of ways of achieving message symmetry across the system. The Interactive 
Consistency (IC) algorithm [Pea 1980], for instance, can be used to transform a message, 
including an asymmetric message, to a symmetric message, whereby the good nodes collectively 
either accept or reject it symmetrically (an agreement) within a time bound. Other methods 
include using variety of engineering practices, for example, using self-checking pair at the node 
level [Hoy 1992][ARI 1993] or central guardian at the system level [Kop 1993][Bau 2003]. 

In this report, we present a protocol (algorithm) that tolerates symmetric faults, provided that 
there are more good nodes than faulty ones. We also present the model checking results of a 
bounded model of the protocol that was used to validate the correctness of the protocol as it 
applies to fully connected networks and confirmed the claims of detenninism and linear 
convergence. Our solution applies equally well to any method that can guarantee message 
symmetry across all receiving good nodes. 

This report is organized as follows. In Section 2 we provide a system overview. We present the 
protocol and its description in Section 3. In Section 4 we present the model checking efforts 
toward verification of correctness of a bounded model of the protocol and the results of that 
effort. Finally, we present concluding remarks in Section 5 and enumerate some possible 
applications. 


2. System Overview 

We considered a system of pulse-coupled entities (e.g., oscillators, pacemaker cells) pulsating 
periodically at regular time intervals. These entities are said to be coupled through some 
physical means (wire or fiber cables, chemical process, or wirelessly through air or vacuum) that 
allows them to influence each other. We modeled the system as a graph with a set of nodes 
(vertices) that represent the pulse-coupled entities and a set of communication links (edges) that 
represent their interconnectivity. 

The underlying topology considered is a fully connected network of K > 1 nodes that exchange 
messages through a set of communication links. Nodes are anonymous, i.e., they do not have 
unique identities. The system consists of a set of good nodes and a set of faulty nodes. A good 
node is assumed to actively participate in the synchronization process and correctly execute the 
protocol. A faulty node is either benign (detectably bad), symmetric, or arbitrary (Byzantine). 
We define a faulty node from the perspective of a source node (sender). A maximum of F faulty 
nodes are assumed to be present in the system, where F > 0. The minimum number of good 
nodes in the system, G, is defined by G = K - F nodes. We denote the maximum number of 
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detectably bad nodes by F», symmetrically bad nodes by F s , arbitrarily (Byzantine) bad nodes by 
F a , and the maximum number of bad nodes by F = F D + F s + F A . The communication links are 
assumed to connect a set of source nodes to a set of destination nodes with a source node being 
different than a destination node, furthermore, we assume no physical self-loop link from the 
node back to itself. We attribute a faulty link behavior to its source node. Therefore, all 
communication li nk s are assumed to be good, i.e., reliably transfer data from their source nodes 
to their destination nodes. The nodes communicate with each other by exchanging broadcast 
messages. Broadcast of a message by a node is realized by transmitting the message, at the same 
time, to all nodes that are directly connected to it. The communication network does not 
guarantee any relative order of arrival of a broadcast message at the receiving nodes, that is, a 
consistent delivery order of a set of messages does not necessarily reflect the temporal or causal 
order of the message transmissions [Kop 1997]. There is neither a central system clock nor an 
externally-generated global pulse or message at the network level. The communication li nk s and 
nodes can behave arbitrarily, provided that the system eventually adheres to the protocol 
assumptions (Section 3.3). 

2.1. Drift Rate 

Each node is driven by an independent, free-running local physical oscillator (i.e., the phase is 
not controlled in any way) and two clocks (i.e., counters), denoted StateTimer and LocalTimer, 
which locally keep track of the passage of time and are driven by the local physical oscillator. 
An oscillator tick, also called a clock tick, is a discrete event and the basic unit of time in the 
network [Tor 2005 A]. 

An ideal oscillator has zero drift rate with respect to real time, perfectly marking the passage of 
time. Real oscillators are characterized by non-zero drift rates with respect to real time. The 
oscillators of the nodes are assumed to have a known bounded drift rate, p, where p is a constant, 
unitless, non-negative real value and is constrained to 0 < p « 1. The maximum drift of the 
fastest clock of a good node over a time interval of t is given by ( 1 + p)t. The maximum drift of 
the slowest clock of a good node over a time interval of t is given by (1/(1 + p))t. 
Therefore, the relative drift of the fastest and slowest good nodes is (1 + p)t - (1/(1 + p))t. 

In simulation and model checking, typically time is modeled to reflect real time with a certain 
accuracy, and the drift of a node is measured with respect to that model of time. In a distributed 
system, addressing clock accuracy is orthogonal to achieving and maintaining synchrony which 
is a measure of the relative precision of the good nodes. Thus, in the context of a correctness 
proof of a distributed protocol, only the relative drift of the good nodes is considered. 

2.2. The Clocks 

Each node has two primary clocks, StateTimer and LocalTimer, which locally keep track of the 
passage of time and are driven by the node’s local physical oscillator. The StateTimer is used for 
operations local to the node as they relate to achieving and maintaining synchrony among the 
good nodes. The LocalTimer is used to properly filter out inherent deviation in the StateTimer 
during the resynchronization process (to be defined shortly) by providing a jitter-free clock to 
the higher level protocols. The LocalTimer is also used in assessing the state of the system from 
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an external perspective. Activities of the StateTimer and LocalTimer of a node during steady 
state are depicted in Figure 1 . 

Resynchronization 

Process 



Figure 1. Activities of a good node during steady state. 

The StateTimer takes on discrete values and is a monotonic linear function increasing from an 
initial value to a maximum value. The synchronization period during steady state, denoted P ST , 
is defined as the largest time interval between any two consecutive resets of the StateTimer by a 
good node. As shown in Figure 1, if uninterrupted, the StateTimer periodically takes on all 
discrete values from its initial value, 0, to its maximum value, Pst, linearly increasing within 
each period and is bounded by 0 < StateTimer < P S t- 

The LocalTimer is also driven by the local physical oscillator, takes on discrete values, and 
locally keeps track of the passage of time. The LocalTimer is a monotonic linear function 
increasing from an initial value to a maximum value. As shown in Figure 1, if uninterrupted, the 
LocalTimer periodically takes on all discrete values from its initial value, 0, to its maximum 
value, P L t, linearly increasing within each period and is bounded by 0 < LocalTimer < P LT . 

These logical clocks need to be periodically synchronized due to the inherent drift in their local 
physical oscillators. In order to achieve synchronization, the nodes communicate by exchanging 
Sync messages. The periodic synchronization during steady state is referred to as the 
resynchronization process which starts when the first good node begins to transmit a burst of 
consecutive Sync messages and ends after the last occurrence of consequent accept event at a 
good node. An accept event occurs when a good node receives a sufficient number of Sync 
messages from as many good nodes. The sufficiency of Sync messages is a function of the type 
and number of faults being tolerated. An upper bound on the duration of the resynchronization 
process will be determined later in this report. 

As discussed previously, the LocalTimer is intended to be used by higher level protocols, and it 
must be managed properly to provide the desired monotonically increasing value between 
adjustments and despite inherent deviation in the StateTimer. The LocalTimer is incremented 
once every local clock tick and is reset either when it reaches its maximum allowed value, Plt, or 
when the StateTimer of the node has reached ResetLocalTimerAt, where ResetLocalTimerAt is 
constrained by the following inequality: 

T 7Zinit\ < ResetLocalTimerAt < Pst- M (1) 
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Where [1 is the ceiling function, n Mt is the initial network precision after a resynchronization 
process, and n is the upper bound on the guaranteed precision. 

The guaranteed synchronization precision, denoted n, is the guaranteed upper bound on the 
maximum separation between the LocalTimers of any two good nodes. The initial precision, 
Jiinit, is the maximum difference between StateTimers of any two good nodes upon completion of 
the resynchronization process. The ResetLocalTimerAt can be given any value in the range 
specified in inequality (1). However, the value must be the same at all good nodes. In this 
inequality, the lower bound indicates when all good nodes have reset their StateTimers and the 
upper bound indicates when the first good node might time out and begin the next round of 
resynchronization process. We choose the earliest such value, ResetLocalTimerAt = \ to 
reset the LocalTimer of all good nodes. Any value greater than T 7T„„,1 will prolong the 
convergence time. The convergence time, denoted C, is defined as the bound on the maximum 
time it takes the network to achieve the guaranteed precision tz. 

2.3. Communication Delay 

The communication delay between directly connected (adjacent) nodes is expressed in terms of 
the minimum event-response delay, D, and network imprecision, cl. These parameters are 
described with the help of Figure 2. 



Figure 2. Event-response delay, D, and network imprecision, d. 

As depicted in this figure, a message transmitted by node A at real time to is expected to arrive at 
its directly connected adjacent nodes (B, C, ...), be processed, and subsequent messages to be 
generated by those nodes within the time interval [to + D, to + D + d]. Communication between 
independently-clocked nodes is inherently imprecise. The network imprecision, d, is the 
maximum time difference among all receivers of a message from a transmitting node with 
respect to real time. The imprecision is due to many factors including, but not limited to, the 
drift of the oscillators with respect to real time, jitter, discretization error, temperature effects and 
differences in the lengths of the physical communication media. These two parameters are 
assumed to be bounded such that D > 0 and d > 0 and both have units of real-time clock ticks. 
The communication delay, denoted y, is expressed in terms of D and cl, is defined as y = D + d, 
and has units of real-time clock ticks. Therefore, the communication delay between any two 
directly connected adjacent nodes is bounded by [ D , y\. 

Although from an external perspective, the value of D and cl, and hence y, are real numbers, 
locally and at the node level, they are treated as discrete values. In other words, and in the rest of 
the report, from the local perspective of a node, D = \D\, cl = [ cl], and y= D + d. 
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3. Protocol Description 


In this section we provide an intuitive description of the protocol behavior followed by a detailed 
description. In order to achieve synchronization, the nodes communicate by exchanging Sync 
messages. Nodes periodically undergo a new round of the resynchronization process. When a 
node’s StateTimer times out, it initiates a new round of the resynchronization process by 
broadcasting a continual burst of (once per y) Sync messages to all other nodes that are directly 
connected to it. During this process, the StateTimer is at its maximum and remains constant, i.e., 
the node neither increments nor resets its StateTimer. This process continues until all good 
nodes participate in the resynchronization process and converge to the guaranteed precision n. A 
good node uses its own message. When a good node receives a sufficient number of Sync 
messages from as many good nodes, an accept event occurs. The sufficiency of Sync messages 
is a function of the type and number of faults being tolerated. When an accept event occurs, the 
node ends its continual broadcast and concludes the resynchronization process by resetting its 
StateTimer. Note that, as depicted in Figure 1, consecutive accept events may occur during a 
resynchronization process. 

Since, due to drift, the duration of the resynchronization process is potentially more than y, and 
during this process, the StateTimer holds a constant value (its maximum), it not a reliable clock 
source for higher level protocols. Thus, the LocalTimer is introduced. To provide the desired 
monotonically increasing clock source to higher-level protocols, the LocalTimer gets reset when 
the StateTimer has reached a predefined value greater than or equal to the guaranteed initial 
network precision, K init . 

3.1. The Sync Message And Its Validity 

We defined earlier that synchrony is a measure of the relative precision of the good nodes. In 
order to achieve and maintain desired synchrony, the nodes communicate by exchanging Sync 
messages 2 . Assuming physical-layer error detection is dealt with separately, the reception of a 
Sync message is indicative of its validity in the value domain. Upon starting a new round of the 
resynchronization process, the node continually sends out Sync messages, once per y, to other 
nodes that are connected to it. Therefore, we say that a Sync message has a life-span, and we 
limit the life-span of the Sync message at the receiving nodes to be y. A Sync message from a 
given source is valid if it arrives at or after one D of an immediately preceding Sync message 
from that source, that is, the valid message in the value domain, i.e., valid Sync messages are 
rate-constrained. Again assuming physical-layer error detection is dealt with separately, the 
reception of a Sync message is indicative of its validity in the value and time domains. Since a 
good node uses its own message, but we assume no physical self-loop link from the node back to 
itself, the message becomes valid only after y giving the own message the longest (worst case) 
transmission delay time. A valid Sync message becomes invalid after its life-span expires. 
However, while valid, the Sync message may be used multiple times and result in multiple accept 
events. 


2 Since only one message type is used for the operation of this protocol, a single bit suffices. 
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3.2. The Monitor, The Synchronizer, And Protocol Functions 

A node consists of a synchronizer and a set of monitors. To assess the behavior of other nodes, 
a node employs as many monitors as the number of nodes that are directly connected to it, with 
one monitor for each source of incoming message. A node uses but does not monitor, its own 
message. The message is kept within the node and, therefore, no physical self-loop link back to 
itself. A monitor keeps track of the activities of its corresponding source node. Specifically, a 
monitor reads, evaluates, validates, and stores the last valid message it receives from that node. 
A valid Sync message is then conveyed to the local synchronizer. The assessment results of the 
monitored nodes are then utilized by the synchronizer in the synchronization process. A monitor 
disposes of the valid message after its life-span expires. 

The function ValiclateMessage() (shown in Figure 3) is used by the monitors to determine 
whether a received Sync message meets the minimum timing requirement, and thus is valid in 
both value and time domains, and whether a stored valid Sync message has reached its lifespan 
and expired. The function Accept( ), used by the synchronizer, examines availability of sufficient 
valid Sync messages. The sufficiency of available, valid messages, denoted by T A , is a function 
of the type and number of faults to be tolerated. T A = F D + 1 and T A = F s + 1 . For tolerating 
benign and symmetric faults, respectively, the following relations must hold: T A = F D + F s + 1 
for tolerating F D + F s simultaneous faults. When a sufficient number of messages have been 
received, the Accept() function returns a Boolean value of true. 


ValidateMessageQ : 

if ( incoming message = Sync) and ( MessageTimer > D) 
MessageValid = true, // store it, 

MessageTimer = 0, 

elseif ( MessageTimer > MessageLifeSpan ) 

MessageValid = false, // it expired 

elseif ( MessageTimer < MessageLifeSpan ) 
MessageTimer = MessageTimer + 1 . 

Accept! ): 

if ( n umber of stored Sync messages > T A ) 
return true, 

else 

return false. 


Figure 3. The protocol functions. 
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3 . 3 . Protocol Assumptions 

1 . The topology is a fully connected graph 3 . 

2. The number of nodes constituting the network is K, where K > 2 Fs + Fp + 1 , and Fs is the 
maximum number of symmetrically bad nodes. 

3. Nodes either correctly execute the protocol and are good, symmetrically bad, F s or 
detectably bad, F D . 

4. Links are bidirectional and correctly transmit data from their sources to their destinations. 

5. The bound on the oscillator drift rate is p, where 0 <p« 1 . 

6. A message sent by a node will be received and processed by its directly-connected, 
adjacent nodes within y, where ;/ = (D + d). 

7. Physical-layer error detection is dealt with separately, thus, the reception of a Sync 
message is indicative of its validity in the value and time domains. 

3 . 4 . The Self-Stabilizing Distributed Clock Synchronization Problem 

To simplify the presentation of the solution, it is assumed that all time references are with respect 
to an initial real time to, where to = 0, and for all t >t 0 the system operates within the protocol 
assumptions. The maximum difference in the value of LocalTimer for all pairs of nodes at time 
t, A Net (t), is detennined by the following equation that accounts for the variations in the values of 
the LocalTimer across all good nodes. 

r = \k ( 1 + p) \ is a time interval encompassing k, 

LocalTimer m i„(t) = min (Ni.LocalTimer(t)), for all i, and 
LocalTimer max (t) = max (Nj.LocalTimer(t)), for all i. 

ANet(t)= min ((LocalTimer max (t) - LocalTimer m i n (t)), 

(LocalTimer max (t - r) - LocalTimer m i„(t - r))). 

The following symbols were defined earlier and are listed here for convenience: 

• P LT has units of real time clock ticks, and is defined as the upper bound on the time 
interval between any two consecutive resets of the LocalTimer by a node and Plt > 0. 

• ANet(t), for real time t, is the maximum difference of values of the LocalTimers of any two 
nodes (i.e., the relative clock skew) for t > t 0 . 

• 7t, the synchronization precision, is the guaranteed upper bound on A Net ( t) for all t> C, 
0 < 7C « Plt- 

• C, the convergence time, is defined as the bound on the maximum time for the network to 
achieve the guaranteed precision it. 

To prove that a protocol is self-stabilizing, it has to be shown that there exist C and tt such that 
the following self-stabilization properties hold. 

1. Convergence: A Ne ,(C) < n, 0 < n « P LT 

2. Closure: For all t > C, A Net (t) < z 

3 . Congruence: For all nodes N f , for all t> C, (N i.LocalTimer(t) = T zrl) ^ A Net (t) < n. 


3 We leave generalization of the topology for future work. 
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4. Liveness: For all t > C, LocalTimer of every node sequentially takes on at least all 

discrete values in [0, P ST - n - y\, see Figures 1 and 6. a. 

3.5. What Self-Stabilization Properties Mean 

The convergence and closure properties address achieving and maintaining network synchrony, 
respectively. As formally defined in the previous section, given sufficient time, C, the 
convergence property examines whether or not the system has reached a point where all nodes 
are within the specified precision. The closure property, on the other hand, examines whether or 
not the system starting within the specified precision will remain within that precision thereafter. 
The convergence and closure properties provide an external view of the system, whereby the 
external viewer can examine whether or not the system has self-stabilized. 

In safety-critical TDMA (Time Division Multiple Access) architectures [Kop 1997] [Min 2002] 
[Tor 2005A, 2005B], synchronization is the most crucial element of these systems. More 
precisely, TDMA-type applications are based on the fundamental assumption of the existence of 
initial synchrony. The protocol presented in this report is meant to provide this fundamental 
requirement of TDMA-type applications to higher-level protocols. However, one of the 
challenges in employing multiple protocols in distributed system has been the integration of 
these protocols operating at different levels of application. Previously, the integration of a 
lower-level protocol with higher-levels either has not been addressed or had simply been 
overlooked. The congruence property addresses this essential requirement. Unlike the 
convergence and closure properties that provide system view from the perspective of an external 
viewer, the congruence property provides a local view from the perspective of a node by 
providing the necessary and sufficient conditions for the node to locally detennine whether or 
not the system has converged. The congruence property, therefore, is essential in the integration 
of this underlying self-stabilization protocol with higher-level protocols in the system. 

The liveness property examines whether or not a node takes on all possible discrete values within 
an expected range. In other words, the system is “alive” and the good nodes execute the protocol 
properly, and time advances within each node. 

3.6. The Self-Stabilizing, Symmetric-Fault Tolerant Synchronization Protocol 

In this section, we present the self-stabilizing, symmetric-fault tolerant synchronization protocol 
that is based on message symmetry assumption. We mentioned earlier that in order to achieve 
and maintain synchrony, the nodes communicate by exchanging Sync messages. Assuming 
physical-layer error detection is dealt with separately, the reception of a Sync message is 
indicative of its validity in the value domain. Upon start of a new round of a resynchronization 
process, the node continually sends out Sync messages, once per y, to other nodes that are 
connected to it. Consequently, the life-span of a Sync message at the receiving nodes is set to be 
y. Also, we mentioned earlier that for tolerating symmetric faults, sufficiency for the Accept() 
function is detennined by T A - F D + F s + 1 . 

The protocol, executed by all good nodes, is presented in Figure 4 and consists of a synchronizer 
and a set of monitors which execute once every local clock tick. Four concurrent if statements 
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collectively describe the synchronizer. These statements are labeled ST ( StateTimer ), LT 
( LocalTimer ), TS (Transmit Sync), and TT ( TransmitTimer ). The function ValiclateMessage() 
describes the monitor. 


Synchronizer: 

ST 1 : if ( StateTimer < 0) or (Accept) )) 
StateTimer := 0, // reset 

ST2: elseif (StateTimer < P ST ) 

StateTimer := StateTimer + 1. 

TT1 : if (TransmitTimer < 0) or 

((TransmitTimer > y) and 
(StateTimer > P S t )) 

TransmitTimer := 0, 

TT 1 : elseif ( TransmitTimer < y) 

TransmitTimer := TransmitTimer + 1. 

LT1 : if (LocalTimer < 0) or 

(LocalTimer > P LT ) or 
(StateTimer = [ u init 1) 
LocalTimer := 0, // reset 

LT2: else 

LocalTimer := LocalTimer + 1. 

TS 1 : if (StateTimer > P ST ) and // timed out 
(TransmitTimer > y) and 
(not Accept ( )) 

Transmit Sync. 

Monitor: 

ValidateMessage(). 


Figure 4. The Symmetric-Fault Tolerant Protocol. 


The following is a list of pertinent protocol measures. 

K > 2F S + 1 , where F s is the maximum number of simultaneous symmetrically faulty nodes 
S(Pst) denotes the maximum drift for the duration of Pst , S(Pst) > 0 

0 < p « 1 

0<D<y « P st ^ P lt 
0 < StateTimer < Pst 
0 < LocalTimer < Pit 
T tinit = d + y+ S(d+y) 

u = ninit + 2 S(Pst) > 0, for all t> C, and so, 0 < u « Pst 

t rp = u + 2y+ UMt , where, t rp denotes duration of the resynchronization process during 
steady state. 

Plt ^ P st + trp = P st + u + 2y+ UM t 
C = Plt + ResetLocalTimerAt + 2 y 

Since 0 < y « P S t < Plt, and the LocalTimer is reset after reaching P LT (worst-case 
wraparound), a trivial solution is not possible. 

Appendix A provides an example to give the reader a quick review and help in understanding of 
the behavior of the protocol. 
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3.7. Determining Protocol Parameters 

We refer to p, d, D, K, T, and Pst as the fundamental protocol parameters and the remaining as 
the derived parameters. In this section, we show how the derived protocol parameters are 
computed. 

7tj„j, - The initial precision, nMt, is the maximum difference between StateTimers of any two 
good nodes during steady sate, for all t > C, and upon completion of a resynchronization process. 
Thus, as depicted in Figure 5, n init = d+ /+ 8(d+y). In this figure, transmitted Sync messages are 
shown using ‘f, received Sync messages using ‘ f’, and the accept events are marked by on 
the time axis. 


12 1 f 

N1 _t 11 j J ]_time 


N2 
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Figure 5. Network precision. 
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Figure 6.b. Network precision. 


5(P st ) 


n - From the definition of the network precision, tc, it follows that, for all t > C, n is the sum of 
initial precision and the maximum drift among the good nodes after Pst from the completion of 
the resynchronization process. Thus, as depicted in Figure 6, even when the nodes start in 
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perfect synchrony, n = 2S(Psr) and since in the worst case they start within n therefore, for the 
worst case, n = n init + 2 d(P ST ). 


t J:]l - From the definition of the resynchronization process, it follows that, during steady state it 
takes 7 r ticks for all good nodes to time out and begin transmitting Sync messages. It takes y 
ticks for the transmitted messages, to reach other good nodes and result in subsequent accept 
events at all good nodes. Since a Sync message has a life-span of one y, subsequent accept 
events occur within the next y. At the end of the resynchronization process, the good nodes are 
within 7Tj n i, ticks of each other. Thus, t rp = n + 2 y+ Ttmu- 

Pi t - The value of Plt is derived from the behavior of the network during steady state and it is a 
measure of the worst case scenario between two consecutive resets of the LocalTimer of a good 
node. Thus, P L t > Pst + t rp = Pst +n + 2y+ n ilul . 

C - The convergence time, C, is measured from t 0 . Its value is the sum of one y due to 
randomness in the initial value of the MessageTimer in the good nodes, plus the worst case 
scenario for the good nodes undergoing a resynchronization process, i.e., P ST , and finally 
converge to the predicted precision n. Therefore, C = y + Pst + t rp + ResetLocalTimerAt, and so, 

C y+ P ST l rp "P ftinit- 


4. Verification Of The Correctness Of The Protocol Via Model Checking 

In this section we present a mechanical verification of the protocol using the model checking 
approach for its ease, feasibility, and quick examination of the problem space, while later 
attempting a more comprehensive proof via theorem proving. The details of the model checking 
effort are similar to [Mai 2012], where the models of the system components and their data 
structures are fully described and similar abstractions are employed with respect to the size of the 
model and real-time delays. We do not restate the details of model checking effort here. Instead, 
we focus on the model checking results. Similar to [Mai 2012], the Symbolic Model Verifier 
(SMV) was used in the modeling of this protocol on a PC with 4GB of memory running Linux 
[Linux]. SMV’s language description and modeling capability provide relatively easy translation 
from the pseudo-code. SMV semantics are synchronous composition, where all assignments are 
executed in parallel and synchronously. Thus, a single step of the resulting model corresponds to 
a step in each of the components. 

The protocol described in this report is fairly subtle and must necessarily cope with many kinds 
of timing behaviors. Model checking has been used to explore and verify distributed algorithms 
but also faces certain difficulties [Ste 2004, 2011][Lon 1007][Mal 2008]. One of the foremost 
challenges is a realistic representation of time as a continuous variable. As we elaborated earlier 
in this report, although the network level measurements are real values, locally and at the node 
level, all parameters are discrete. The discretization is used for practical purposes in 
implementing and model checking the protocol. Since continuous time modeling is 
impracticable, we used the same abstractions as in [Mai 2012] for discrete time. 
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4.1. Propositions 

Computational tree logic (CTL), a temporal logic, is used to express properties of a system. In 
CTL formulas are composed of path quantifiers, E and A, and temporal operators, X, F, G, 
and U [Cla 1981], In this section the claims of convergence, closure, congruence, and liveness 
properties as well as the claims of maximum convergence time and determinism of the protocol 
are examined. Although in the description of the protocol convergence and closure properties 
are stated separately, they are examined via one CTL proposition. This proposition also 
expresses the claims of determinism and linear convergence. Validation of this general CTL 
proposition requires examination of a number of underlying propositions. In particular, since 
A LocaiTimer(t ) is defined in tenns of the LocalTimer of the nodes, examination of the properties that 
describe proper behavior of the LocalTimer take precedence. The variable ElapsedTime is used 
in these properties and is defined here. 

ElapsedTime = (GlobalClock > ConvergenceTime) ; 

The GlobalClock is a measure of elapsed time from the beginning of the operation with respect 
to the real time, i.e., external view. The ElapsedTime is indicative of the GlobalClock reaching 
its target maximum value of ConvergenceTime . 

Proposition SystemLiveness: This property addresses the liveness property of the system by 
examining whether or not time advances and the amount of time elapsed, ElapsedTime, has 
advanced beyond the predicted convergence time, ConvergenceTime. 


AF ( ElapsedTime ) 

Proposition ConvergenceAndClosure: This proposition encompasses the criteria for the 
convergence and the closure properties as well as the claims of maximum convergence time and 
determinism. The proposition specifies whether or not the system will converge to the predicted 
precision after the elapse of convergence time, ElapsedTime, and whether or not it will remain 
within that precision thereafter. This and subsequent properties are expected to hold. 

The proper value of the AllWithinPrecision is detennined by measuring the difference between 
the maximum and minimum values of the LocalTimers of all nodes for the current tick and in 
conjunction with the result from the previous r = \n (1 + p) \ ticks. The expected difference of 
LocalTimers is the predicted precision bound. 

— Determinism Property 
AF ( ElapsedTime ) A 

— Convergence Property 

AG ( ElapsedTime — > AllWithinPrecision) A 

— Closure Property 

AG ((ElapsedTime A AllWithinPrecision) — > AX (ElapsedTime A AllWithinPrecision)) 


To eliminate trivial results and false positives, the following proposition is examined, and the 
expected result is a value of false. This property specifies that after the elapse of convergence 
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time, ElapseclTime, whether or not the system will not converge or if it converges, whether or not 
it drifts apart beyond the expected precision bound. 


AF (ElapsedTime) A 

AG (ElapsedTime — * AllWithinPrecision ) A AG ((ElapsedTime A AllWithinPrecision ) 
— > EX AllWithinPrecision )) 


Proposition Congruence : This property specifies the criteria for the congruence property of the 
protocol. Unlike the convergence and closure properties that provide system view from the 
perspective of an external viewer, the congruence property provides a local view from the 
perspective of a node by providing necessary and sufficient conditions for the node to locally 
detennine whether or not the system has converged. The congruence property is essential in the 
integration of this underlying self-stabilization protocol with higher level protocols in the system. 
The congruence property is described with respect to only one node, namely Node_l. Since all 
nodes are symmetric, the result of the proposition equally applies to other nodes. 


AF (ElapsedTime) A AG ((ElapsedTime A (Node_l.LocalTimer= \tc\)) 
— > AX (ElapsedTime A AllWithinPrecision)) 


Proposition ProtocolLiveness: This property specifies the criteria for the liveness property of 
the protocol. The property examines whether or not a node takes on all discrete values within an 
expected range. Again, since all nodes are symmetric, this property is described with respect to 
only one node, namely Node_l . 


AF ( ElapsedTime ) A 

AG (((ElapsedTime) A (Node_l .LocalTimer = i)) 

— > AX ((Node_l. LocalTimer = i) I (Node _1. LocalTimer = i+1 ))) A 
AG (((ElapsedTime) A (Node_l.LocalTimer= P LT )) 

— > AX (Node _1 .LocalTimer = 0)) 

For cdl i = 0 .. (P ST - n - y) 


The model checking results of the bounded model of the protocol have verified the correctness of 
the protocol for fully connected networks with K > 2 Fs + 1 nodes, starting from an arbitrary 
state, and for the following scenarios. F s = 0, 1, 2, 3, simultaneous symmetric faults, 0 <p« 1, 
D = 1 and d = 0. F s = 2 simultaneous symmetric faults, 0 < p « 1, D = 2, 3, and d = 0,1. In 
addition, the results have confirmed the claims of detenninism and linear convergence. 


5. Conclusions 

Distributed systems have become an integral part of safety-critical computing applications, 
necessitating system designs that incorporate complex fault-tolerant, resource-management 
functions to provide globally coordinated operations with ultra-reliability. As a result, a fault- 
tolerant system needs a clock synchronization algorithm that tolerates imprecise local clocks and 
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faulty behavior by some processes. In this report we presented a strategy for synchronizing a 
distributed system in the presence of various faults, including any number of arbitrary 
(Byzantine) faults. The main issue in solving the self-stabilization problem is a lack of a 
symmetric view in the system by the participating good nodes. Thus, the crux of our idea was to 
first convert any message to a symmetric message and then use a verified protocol, based on 
message symmetry assumption, to solve the synchronization problem. We first enumerated 
several ways of achieving message symmetry across the system, and then presented a new 
protocol based on message symmetry assumption. We also presented a mechanical verification 
of the protocol for up to three simultaneous, symmetric faults. The model-checking effort was 
focused on verifying the correctness of a bounded model of the protocol as well as confirming 
claims of detenninism and linear convergence with respect to the self-stabilization period. As a 
result, we believe that our solution solves the general case of this problem for fully connected 
graphs. We leave however the generalization of our solution to other topologies, including an 
arbitrary graph that meets the minimum requirements of number of nodes and connectivity, to 
future works. 

The proposed self-stabilizing protocol is expected to have many practical applications as well as 
many theoretical implications. GPS (Global Positioning System) denied environment or where 
GPS is non-existent (e.g., Mars mission), embedded systems, power grid, distributed process 
control, synchronization, computer networks, the Internet, Internet applications, security, safety, 
automotive, aircraft, distributed air traffic management systems, swarm systems, wired and 
wireless telecommunications, graph theoretic problems, leader election, TDMA (time division 
multiple access), and banking and commerce are a few examples. These are some of the many 
areas of distributed systems that can use synchronization in order to design more robust 
distributed systems. 
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Appendix A 


The purpose of this example is to give the reader a quick review of and help in understanding the 
behavior of the protocol. The following is an example of a fully connected graph consisting of 5 
nodes, where F = 2. Table A.l shows an execution trace of the system and has eight columns; 
one for time reference, two for each good node listing values for the StateTimer and LocalTimer, 
and the last column is for network precision, n. Each row depicts activities of all good nodes at 
the corresponding time. Cell contents for the node columns consist of a number corresponding 
to the value of the StateTimer of the node in conjunction with an activity: 1) Sync if the node 
transmits the message, and 2) Accept if the node received T A messages. The received messages 
at a node are depicted in superscripts, one position for each corresponding node, where a 
means no messages from that node and an ‘x’ means a Sync message was received. 

This table depicts activities of the network during a re synchronization process when the network 
is in steady state. Even though the good nodes started the cycle in synchrony, they gradually 
drifted apart. The table shows a scenario where node 1 is the fastest and node 3 the slowest of 
the good nodes and by the end of the synchronization period they have drifted part by as much as 
12 clock ticks from an external perspective. Since the faulty nodes can transmit messages at any 
time, their activities are not listed in the table. However, their messages are recorded at the 
receiving good nodes. For instance, at (t + 8) a message from node 5 (a faulty node) is received 
by nodes 1 and 2 and d ticks later node 3 records receiving the same message. The columns 
representing LocalTimer values are shaded gray for visual purposes. The V column shows that 
although the instantaneous differences between the LocalTimers spike up to a value of 999 at 
(t + 17), the precision n as defined in Section 3.4 remains within the theoretical predicted value 
of 16. 
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System parameters: 

D = 3 clock ticks, d = 1 clock tick -> 7 = 4 clock ticks 

AT = 5 nodes, G = 3 nodes, F = 2 nodes T A = 3 nodes 

Psr = 1000 clock ticks 

0 < p < 1 -> 0 < S(P ST ) < 5 clock ticks 

TTinit = d+ y+ S(d + y) n init = 6 clock ticks 

n = 7i i nit + 23(P S t) > 0 -> n = 16 clock ticks 

r = \n (1 + p)l = 17 clock ticks 

t, v = 7 T + 2;/+ jTj n j, -> t rp = 30 clock ticks 

Put > Psr + t rp -> P LT = 1030 clock ticks 

ResetLocalTimerAt = 7 T, m; ResetLocalTimerAt = 6 clock ticks 
C = Plt + ResetLocalTimerAt + 2/-^ C = 1044 clock ticks 


Table A. 1 . An execution trace of a network of 5 nodes. 
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